Import data

subedgar20160110 <- read.csv("/Users/sn0wfree/Dropbox/InvestorAttention_data/R_datashow/log20160111.csv", header = TRUE)

Show the First 10 Rows of log20160111

head(subedgar20160110, n=10)

Caption (Brief):

| Variable name | Description |
|:--|:--|
| IP | IP address (###.###.###.xxx), with the fourth octet obfuscated. |
| Date | Apache log file date. |
| time | Apache log file time. |
| zone | Apache log file zone. |
| cik | SEC Central Index Key (CIK): the company or individual code. |
| accession | SEC accession number of the requested document. |
| extention | File name of the requested document, including its extension; the accession number stands in when the file name is missing. |
| Code | Apache log file status code for the request. |
| Size | Size of the requested document. |
| Idx | Dummy variable: 1 if the requester landed on the index page of a set of documents, 0 otherwise. |
| Norefer | 1 if the referrer field is empty, 0 otherwise. |
| Noagent | 1 if the user agent field is empty, 0 otherwise. |
| Find | Categorical variable (0-10), assigned according to the character strings found in the referrer field; for details see Caption (Detail) below. |
| crawler | 1 if the user agent self-identifies as a robot or the request has status code 404, 0 otherwise. |
| browser | Categorical string variable indicating which device or internet browser the requester used. |

Caption (Detail):

  • IP: The first three octets of the IP address, with the fourth octet obfuscated by a three-character string that preserves the uniqueness of the last octet without revealing the full identity of the IP (###.###.###.xxx). For example, all fourth octets of 150 have the same three-character string across all files.
  • Date: Apache log file date
  • time: Apache log file time
  • zone: Apache log file zone
  • cik: SEC Central Index Key (CIK) associated with the document requested: company code
  • accession: SEC document accession number associated with the document requested
  • extention: The filename of the file requested, including the document extension. If the filename is missing and only the file extension is present, then the filename is the document accession number.
  • Code: Apache log file status code for the request
  • Size: document file size
  • Idx: takes on a value of one if the requester landed on the index page of a set of documents (e.g., -index.htm), and zero otherwise

  • Norefer: takes on a value of one if the Apache log file referrer field is empty, and zero otherwise

  • Noagent: takes on a value of one if the Apache log file user agent field is empty, and zero otherwise

  • Find: numeric values from 0 to 10 that correspond to whether the following character strings /[$string]/ were found in the referrer field. This could indicate how the document requester arrived at the document link (e.g., internal EDGAR search):
  1. $find=0;
  2. if($referrer=~m/.*(action\=getcompany)/){$find=1};
  3. if($referrer=~m/.*(action\=getcurrent)/){$find=2};
  4. if($referrer=~m/.*(Find\+Companies)/){$find=3};
  5. if($referrer=~m/.*(cgi\-bin\/srch\-edgar)/){$find=4};
  6. if($referrer=~m/.*(EDGARFSClient)/){$find=5};
  7. if($referrer=~m/.*(cgi\-bin\/current)/){$find=6};
  8. if($referrer=~m/.*(Archives\/edgar)/){$find=7};
  9. if($referrer=~m/.*(cgi\-bin\/viewer)/){$find=8};
  10. if($referrer=~m/.*(.*\-index)/){$find=9};
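The Perl rules for Find run one after another rather than as an if/else chain, so a referrer that matches several patterns ends up with the value of the last matching rule. The following is a minimal R sketch of that logic, not code shipped with the data set: the helper name `classify_find`, the vector `find_patterns`, and the sample referrer strings are all hypothetical, and the patterns are transcribed by hand from the list above.

```r
# Patterns transcribed from the Perl rules above, in their original order.
# The leading ".*" of each Perl pattern is dropped because grepl() already
# searches anywhere in the string.
find_patterns <- c(
  "action=getcompany",   # find = 1
  "action=getcurrent",   # find = 2
  "Find\\+Companies",    # find = 3
  "cgi-bin/srch-edgar",  # find = 4
  "EDGARFSClient",       # find = 5
  "cgi-bin/current",     # find = 6
  "Archives/edgar",      # find = 7
  "cgi-bin/viewer",      # find = 8
  "-index"               # find = 9
)

# classify_find() is a hypothetical helper, vectorised over referrers.
classify_find <- function(referrer) {
  find <- rep(0L, length(referrer))            # rule 1: the default is 0
  for (i in seq_along(find_patterns)) {
    hit <- grepl(find_patterns[i], referrer, perl = TRUE)
    find[hit] <- i                             # later rules overwrite earlier ones
  }
  find
}

# Made-up referrer strings, for illustration only.
sample_referrers <- c(
  "https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000000000",
  "https://www.sec.gov/Archives/edgar/data/0/0000000000-00-000000-index.htm",
  ""
)
print(classify_find(sample_referrers))
```

The second sample matches both the `Archives/edgar` rule and the `-index` rule; because the rules are applied in order, it is coded as 9 rather than 7.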

  • crawler: takes on a value of one if the user agent self-identifies as one of the following web crawlers or the request has a status code of 404. Below are the actual Perl regular expressions used:
  1. if($agent=~m/(wget|Googlebot|polybot|Yahoo\!\s*Slurp|spider|robot|perl|python|lwp|crawler)/i){$crawl=1};
  2. if($code==404){$crawl=1};

  • browser: a three-character string that identifies the potential browser type by checking whether the user agent field contains the following /[text]/. Below are the actual Perl regular expressions used:
  1. if($agent=~m/MSIE/){$browser="mie"};
  2. if($agent=~m/Firefox/){$browser="fox"};
  3. if($agent=~m/Safari/){$browser="saf"};
  4. if($agent=~m/Chrom/){$browser="chr"};
  5. if($agent=~m/Seamonk/){$browser="sea"};
  6. if($agent=~m/Opera/){$browser="opr"};
  7. if($agent=~m/(DoCoMo|KDDI|Cricket|Vodaphone)/){$browser="oth"};
  8. if($agent=~m/Windows\s*NT/){$browser="win"};
  9. if($agent=~m/Mac\s*OS/i){$browser="mac"};
  10. if($agent=~m/Linux/i){$browser="lin"};
  11. if($agent=~m/iPhone/){$browser="iph"};
  12. if($agent=~m/iPad/){$browser="ipd"};
  13. if($agent=~m/Android/){$browser="and"};
  14. if($agent=~m/(BB10|PlayBook|BlackBerry)/){$browser="rim"};
  15. if($agent=~m/(IEMobile|Windows\s*CE|Windows\s*Phone)/){$browser="iem"};
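The crawler and browser rules above can be mirrored in R in the same way. This is only a sketch under stated assumptions: `flag_crawler`, `classify_browser`, and `browser_rules` are hypothetical names, the user agent strings are made up for illustration, and returning `NA` when no browser rule matches is an assumption, since the rules do not say what the default is.

```r
# flag_crawler() is a hypothetical helper: 1 if the agent matches the
# crawler pattern (case-insensitive, as in the Perl /i flag) or the
# status code is 404, else 0.
flag_crawler <- function(agent, code) {
  bot_pattern <- "(wget|Googlebot|polybot|Yahoo!\\s*Slurp|spider|robot|perl|python|lwp|crawler)"
  is_bot <- grepl(bot_pattern, agent, ignore.case = TRUE, perl = TRUE)
  is_404 <- !is.na(code) & code == 404
  as.integer(is_bot | is_404)
}

# Browser rules in their original order; only rules 9 and 10 are
# case-insensitive in the Perl source.
browser_rules <- data.frame(
  pattern = c("MSIE", "Firefox", "Safari", "Chrom", "Seamonk", "Opera",
              "(DoCoMo|KDDI|Cricket|Vodaphone)", "Windows\\s*NT", "Mac\\s*OS",
              "Linux", "iPhone", "iPad", "Android",
              "(BB10|PlayBook|BlackBerry)",
              "(IEMobile|Windows\\s*CE|Windows\\s*Phone)"),
  label = c("mie", "fox", "saf", "chr", "sea", "opr", "oth", "win", "mac",
            "lin", "iph", "ipd", "and", "rim", "iem"),
  ignore_case = c(rep(FALSE, 8), TRUE, TRUE, rep(FALSE, 5)),
  stringsAsFactors = FALSE
)

# classify_browser() is a hypothetical helper; rules are applied in order,
# so the last matching rule wins. NA for "no match" is an assumption.
classify_browser <- function(agent) {
  browser <- rep(NA_character_, length(agent))
  for (i in seq_len(nrow(browser_rules))) {
    hit <- grepl(browser_rules$pattern[i], agent,
                 ignore.case = browser_rules$ignore_case[i], perl = TRUE)
    browser[hit] <- browser_rules$label[i]
  }
  browser
}

# Made-up user agent strings, for illustration only.
chrome_on_windows <- "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 Chrome/47.0 Safari/537.36"
print(classify_browser(chrome_on_windows))
print(flag_crawler(c("Wget/1.17 (linux-gnu)", "Mozilla/5.0"), c(200, 200)))
```

Because the rules are sequential, an agent string that names both a browser and an operating system is labelled by whichever rule comes last. Under this transcription the desktop Chrome string above is coded "win", not "chr", so browser is better read as a mixed device/browser indicator than as a pure browser name.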
